Methods for categorisation
Under the Email and work item | Methods for categorisation menu choice in CallGuide Admin you can choose a method to search and categorise incoming emails and work items.
The two different categorisation methods are:
An email can consist of text, with or without formatting. A formatted email can, for example, be an HTML page that can also be opened in a web browser. CallGuide Email can convert the relevant content of HTML emails into text, so these can also be searched and categorised using the settings you have made. Where an email consists of several alternative parts, for example an HTML page accompanied by the same information as plain text, only one part is searched.
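The handling of alternative parts described above can be sketched with Python's standard email package. This is only an illustration, not CallGuide's implementation: the function name searchable_text is hypothetical, and the order of preference (plain text first, HTML as fallback) is an assumption, since the text only states that a single part is searched.

```python
from email import message_from_string, policy
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def searchable_text(raw_email: str) -> str:
    """Return the text of the single part that would be searched (hypothetical helper)."""
    msg = message_from_string(raw_email, policy=policy.default)
    # Assumption: prefer the plain-text alternative and fall back to HTML.
    part = msg.get_body(preferencelist=("plain", "html"))
    if part is None:
        return ""
    content = part.get_content()
    if part.get_content_type() == "text/html":
        # Reduce the HTML page to its text content before searching.
        extractor = _TextExtractor()
        extractor.feed(content)
        extractor.close()
        content = " ".join(extractor.chunks)
    # Collapse runs of whitespace so the result is one clean line of text.
    return " ".join(content.split())
```

Only one part is ever returned, so the same information is not counted twice when an email carries both a text and an HTML version.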
Keyword-based categorisation
Keyword-based categorisation means that a text will be searched in order to find certain keywords. These keywords are defined for a number of categories.
To use this method, select CallGuide Keyword in the drop-down menu under Methods for categorisation.
In the Categories box you will find a number of categories which are already defined in your CallGuide system. Tick the categories you want to use.
Add a category
To add a new category, click the Add category… button.
Enter the category name in the entry field and click on OK.
You can now enter keywords for the new category by selecting it in the Category drop-down menu under the Change of Category header.
Keywords can either be loaded from an existing text file using the Download from file button or be entered manually as a list in the Keywords box, in the bottom right part of the window.
If you tick the View help to add keyword box, the rules used to define keywords are displayed. The rules are:
- Categorisation is performed irrespective of upper and lower case letters. This means that ‘computer’ and ‘Computer’ are identical keywords and both keywords will match the words ‘computer’, ‘Computer’, ‘COMPUTER’, ‘compuTER’, etc.
- An asterisk (*) is used to get a keyword to match all words in the text beginning with a particular prefix. The keyword ‘computer*’ will match the words ‘computer’, ‘computers’, ‘computerisation’, etc.
- A plus sign (+) is placed before a keyword to indicate that the keyword is necessary, i.e. that the keyword must be present in the text. In other words, the keyword ‘+computer’ means that ‘computer’ is a necessary word.
- A minus sign (-) is placed before a keyword to indicate that the keyword is prohibited, i.e. that the keyword may not be present in the text. In other words, the keyword ‘-computer’ means that ‘computer’ is a prohibited word.
- An equals sign (=) is placed before a keyword to allow special characters to be included in the keyword. The keyword ‘=+computer’ will match ‘+computer’ in the input text.
When the keyword list is finished, click on Save keyword. Note that you may also save the keywords to a file.
In the example picture, the words “der”, “das” and “golfplatz” belong to the “german” category. The “+” sign in front of “der” and “das” means that both of these words must be found in the email text for this category to be included in the overall categorisation result.
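The keyword rules and the “german” example can be illustrated with a small sketch. It is not CallGuide's actual matcher. The names Keyword, parse_keyword, tokenize and category_weight are hypothetical, and several details are assumptions: words are split on whitespace, everything after ‘=’ is taken literally (including any ‘*’), and a category's weight is the number of words in the text that match one of its keywords.

```python
from dataclasses import dataclass

_PUNCTUATION = ".,;:!?\"'()[]"


@dataclass(frozen=True)
class Keyword:
    text: str         # lower-cased word or prefix to look for
    prefix: bool      # True if the keyword was written with a trailing '*'
    required: bool    # True if the keyword was written with a leading '+'
    prohibited: bool  # True if the keyword was written with a leading '-'

    def matches(self, word: str) -> bool:
        return word.startswith(self.text) if self.prefix else word == self.text


def parse_keyword(raw: str) -> Keyword:
    raw = raw.strip().lower()  # matching ignores upper and lower case
    if raw.startswith("="):
        # '=' escapes special characters; assumption: the rest is fully literal.
        return Keyword(raw[1:], False, False, False)
    required = raw.startswith("+")
    prohibited = raw.startswith("-")
    if required or prohibited:
        raw = raw[1:]
    prefix = raw.endswith("*")
    if prefix:
        raw = raw[:-1]
    return Keyword(raw, prefix, required, prohibited)


def tokenize(text: str) -> list[str]:
    # Assumption: words are separated by whitespace; surrounding punctuation is dropped.
    stripped = (token.strip(_PUNCTUATION).lower() for token in text.split())
    return [word for word in stripped if word]


def category_weight(keywords: list[str], text: str) -> int:
    """Return 0 if the category is ruled out, else the number of matching words."""
    parsed = [parse_keyword(k) for k in keywords]
    parsed = [kw for kw in parsed if kw.text]  # ignore empty keywords
    words = tokenize(text)
    for kw in parsed:
        present = any(kw.matches(w) for w in words)
        if kw.required and not present:
            return 0  # a '+' keyword is missing from the text
        if kw.prohibited and present:
            return 0  # a '-' keyword occurs in the text
    counting = [kw for kw in parsed if not kw.prohibited]
    return sum(1 for w in words if any(kw.matches(w) for kw in counting))
```

With the keywords ‘+der’, ‘+das’ and ‘golfplatz’, a text containing “der” and “golfplatz” but not “das” gets weight 0 in this sketch, because one of the required words is missing.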
Settings for the selected method - Parameters
Under the Parameters heading to the right you enter the values for the max_categories and cut_off parameters.
- The max_categories parameter controls how many categories the CallGuide Email Text Categoriser may return. If, for instance, max_categories is set to 2 and an email matches 3 categories, the Text Categoriser returns only the two with the strongest weight, i.e. those with the most matches found. If, on the other hand, CallGuide finds only one category, this sole category is returned. Allowing more than one category is useful if you map categories to task types, since a customer may well mention more than one task in an email.
- The cut_off parameter is used to sift out poor category matches, i.e. it removes a category that is mentioned much less than other categories. The value for cut_off is a number between 0.0 and 1.0.
Example for the Swedish and English categories:
If there are 30 English words and one Swedish word in an email, you may want the categorisation to return only English. To achieve this, the categories are weighed against each other.
If CallGuide finds 30 English words and one Swedish word, the weight is 30 for English and 1 for Swedish, giving a ratio of 1/30 ≈ 0.0333. If you have set cut_off to 0.1, Swedish will not be returned as a category, since 0.0333 is less than 0.1.
If CallGuide instead finds 30 English words and 15 Swedish words, agents speaking both languages might be needed and you might want both languages to be returned. The ratio is 15/30 = 0.5, and since 0.5 is greater than 0.1, Swedish will be returned as well.
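The interplay of the two parameters can be sketched as follows. The function select_categories is hypothetical and only mirrors the arithmetic of the example. It assumes that each weight is compared with the strongest weight found, and that a ratio exactly equal to cut_off still passes; the text does not specify either detail.

```python
def select_categories(weights: dict[str, float], max_categories: int, cut_off: float) -> list[str]:
    """Return the category names that survive cut_off, strongest first."""
    # Only categories that were actually found in the text take part.
    found = {name: weight for name, weight in weights.items() if weight > 0}
    if not found:
        return []
    strongest = max(found.values())
    # Assumption: a category is kept if weight / strongest weight reaches cut_off.
    kept = [name for name, weight in found.items() if weight / strongest >= cut_off]
    kept.sort(key=lambda name: found[name], reverse=True)
    # max_categories limits how many of the remaining categories are returned.
    return kept[:max_categories]


# 1/30 is below cut_off 0.1, so only English remains.
print(select_categories({"English": 30, "Swedish": 1}, max_categories=2, cut_off=0.1))
# 15/30 = 0.5 passes cut_off 0.1, so both languages are returned.
print(select_categories({"English": 30, "Swedish": 15}, max_categories=2, cut_off=0.1))
```

Setting max_categories to 1 in the second call would return only English, even though Swedish passes the cut_off test.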